Quick Start: 3 Steps

TIMEOR accepts 2 input types: (1) raw .fastq files and an SraRunTable (e.g. here), or (2) an RNA-seq time-series read count matrix (e.g. here) and a metadata file (e.g. here).

  1. Visit https://timeor.brown.edu.
  2. For (1), in ‘Example Data’ (side-bar) under ‘Load raw data’, click the ‘SraRunTable & .fastq files’ button. This will guide you through the ‘Process Raw Data’ tab demo. Follow the pop-ups and fill in the grey boxes. See Run TIMEOR below for a walk-through.
  3. Next, for (2), in ‘Example Data’ (side-bar) under ‘Load count matrix’, click the ‘Metadata & read count file’ button. This will guide you through the rest of the full method demo. Follow the pop-ups and fill in the grey boxes. See Run TIMEOR below for the full application walk-through.

Website

Computational Biology Core at Brown University and DRSC/TRiP Functional Genomics Resources at Harvard Medical School Partnership

TIMEOR is available online at https://timeor.brown.edu.

Run TIMEOR

Two ways to input data:

  1. Import an SraRunTable from GEO*. TIMEOR will then process the raw data: retrieving .fastq files, quality control, alignment, and read count matrix creation. See “Run TIMEOR from Raw Data” below.

  2. Import a metadata file** and a count matrix*** to skip raw data retrieval, quality control, alignment, and read count matrix creation, and proceed straight to normalization and correction. See “Run TIMEOR Using Simulated Data” below.

Then simply follow the prompts. Fill out the grey boxes to begin interacting with each stage and tab.

Input file types:

* SraRunTable from GEO: follow the instructions in TIMEOR’s first tab (“Process Raw Data”).

** metadata file: requires at least these columns: ID, condition, time, batch.

  • ID: a unique identifier for each sample (e.g. case_1min_rep1)

  • condition: one-word description (e.g. case, control)

  • time: numerical values (e.g. 0, 20, 40)

  • batch: string description of the batch (e.g. b1, b2, b3)

*** count matrix: rows should be unique gene identifiers (e.g. Flybase, Ensembl, or Entrez IDs) and columns should be the IDs from the metadata file.
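
The requirements above can be checked before uploading. The following is a minimal sketch and not part of TIMEOR: it assumes comma-separated text with a header row, and that the first column of the count matrix holds the gene identifiers (both are assumptions; adjust them to your files). The function name `check_inputs` and the toy data are hypothetical.

```python
import csv
import io

# Columns the metadata file must contain, as listed above.
REQUIRED_COLUMNS = ("ID", "condition", "time", "batch")


def check_inputs(metadata_text, counts_text, delimiter=","):
    """Return a list of problems found (an empty list means none)."""
    meta_reader = csv.DictReader(io.StringIO(metadata_text), delimiter=delimiter)
    fieldnames = meta_reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in fieldnames]
    if missing:
        return ["metadata is missing columns: " + ", ".join(missing)]

    problems = []
    rows = list(meta_reader)
    ids = [row["ID"] for row in rows]
    if len(set(ids)) != len(ids):
        problems.append("metadata IDs are not unique")
    for row in rows:
        try:
            float(row["time"])  # time must be numerical
        except (TypeError, ValueError):
            problems.append("non-numeric time for " + str(row["ID"]))

    count_rows = [r for r in csv.reader(io.StringIO(counts_text), delimiter=delimiter) if r]
    if not count_rows:
        problems.append("count matrix is empty")
        return problems
    header, body = count_rows[0], count_rows[1:]
    sample_columns = header[1:]  # assumption: first column is the gene identifier
    if set(sample_columns) != set(ids):
        problems.append("count matrix columns do not match metadata IDs")
    genes = [r[0] for r in body]
    if len(set(genes)) != len(genes):
        problems.append("gene identifiers are not unique")
    return problems


# Toy inputs, made up for illustration only.
metadata = "ID,condition,time,batch\ncase_0min_rep1,case,0,b1\ncase_20min_rep1,case,20,b1\n"
counts = "gene,case_0min_rep1,case_20min_rep1\nFBgn0000001,10,25\nFBgn0000002,0,3\n"
print(check_inputs(metadata, counts))
```

An empty list means the two files agree with each other; any returned message points at the requirement that was not met.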

Run TIMEOR from Raw Data: Starting from .fastq Time-Series RNA-seq

This tutorial uses a subset of real data used in the TIMEOR publication to take the user through TIMEOR’s “Process Raw Data” tab.

  1. In the far-left navigation bar click on “Example Data” and then under “Load real data” click on “SraRunTable & raw data”.

 

 

  2. Follow the pop-up prompt to explore the default answers to questions 1-6, which set the adaptive default parameters. Then click the “Run” button to begin retrieving the raw data (SRR8843738 and SRR8843750), performing quality control, and aligning the reads using HISAT2 and Bowtie2.

  3. Once the data have been retrieved and quality control has finished, you can view a summary under the “Quality Control” panel on the right. Interactive MultiQC results can be downloaded and viewed.

  4. You can compare the alignment results of the two methods in the “Alignment Quality” panel (note that HISAT2 is splice-site aware). Choose a method and then click “Generate count matrix” to have TIMEOR generate the read count matrix for the next tab, “Load Count Matrix”.

Run TIMEOR Using Simulated Data: Starting from Read Count Matrix

This tutorial uses simulated data and takes the user through TIMEOR’s functionality. NOTE: figures with two panels show the same page, split in two.

  1. In the far-left navigation bar click on “Example Data” and then under “Load simulated data” click on “Metadata & count matrix”.

 

 

  2. Follow the pop-up prompt to explore results on each Pre-processing tab (Process Raw Data, Load Count Matrix, and Normalize and Correct Data).

  3. On the Normalize and Correct Data tab, choose from the normalization and correction methods and click “Run” to view the result.

  4. Proceed to Primary Analysis and click “Run”.

  5. At the bottom right you will see a notification to click “Render Venn Diagram” in the top right to compare differential expression results between three methods (ImpulseDE2, Next maSigPro, and DESeq2). See figure below, blue box, bottom left.

  6. Download prev_study.txt, then upload it using the Browse button to compare a previous study with the three differential expression results. See figure below, left.

  7. Examine the differential expression method results in the bottom row. On the left, toggle under “Display Desired Differential Expression Method Results” between ImpulseDE2, Next maSigPro, and DESeq2; the interactive clustermap with automated clustering will display the differentially expressed gene trajectories for the chosen method. See figure below, right.

  8. Toggle under “Cluster Gene Expression Trajectories” to choose the number of clusters desired. See figure below, right.

  9. For this demo, please choose ImpulseDE2 and automatic clustering before proceeding to Secondary Analysis. These results can be processed efficiently. NOTE: ImpulseDE2 is chosen because it has the largest overlap of differentially expressed genes with the previous study and the other methods.

  10. Under “Gene Expression Trajectory Clusters”, choose cluster 1, 2, or 3 in the dropdown. On the right, under “Chosen Cluster Gene Set”, you will see the genes in that cluster. They appear in the same color as the cluster.

  11. Once you have chosen which gene set to test for enrichment, switch the “Analyze” toggle to “ON”.

  12. Wait to view any enriched gene ontology (GO) terms (Molecular Function, Biological Process, or Cellular Component), pathway, network, and/or motif analysis. NOTE: you may download the interactive motif results for viewing.

  13. Switch the “Analyze” toggle to “OFF” to choose another gene set, and repeat steps 10-13.

  14. Proceed to the Factor Binding tab to view the perturbed and top predicted transcription factors in each gene cluster (under “Perturbed and Top 4 Predicted Transcription Factors to Bind Each Cluster”).

  15. In that same table, on the right, you will see ENCODE IDs indicating published ChIP-seq data for the predicted transcription factors. You may download the .bigWig files here (ENCFF467OWR, ENCFF609FCZ, ENCFF346CDA) or follow the prompts in the grey box under “Upload .bigWig Files”. If you are interested, click on the “+” under “Details about individual method predicted transcription factors” to see the ranked lists of transcription factors and motifs by method. NOTE: you can download the interactive cluster motif results to view all motifs. NOTE: blanks indicate no transcription factor consensus among all methods.

  16. Under “Average Profiles Across Each Gene Expression Trajectory Cluster”, in the first box type “stat92e”, upload ENCFF467OWR.bigWig, and click “Go”. In the second box type “pho”, upload ENCFF609FCZ.bigWig, and click “Go”. In the third box type “CG7786”, upload ENCFF346CDA.bigWig, and click “Go”. You will see 3 average profile distribution plots, one for each cluster, each easily distinguishable by color (the same colors as in the clustermap).

  17. Proceed to the last tab to view the temporal relations between transcription factors. The first row reminds you of the perturbed and predicted transcription factors to bind each gene cluster. On the second row, to the left, you will see the “Transcription Factor Network” of both perturbed and predicted transcription factors. On the right (“Temporal Relations Between Perturbed and Top Predicted Transcription Factors”) you will see a table highlighting the temporal relations between transcription factors. Five different temporal relationships are identified and represented in the legend (far right).

  18. On the third row (“Network Customization: move and add desired genes to describe temporal relation”) you can use this information to create a customized network that temporally relates all transcription factors and other genes. Do so by clicking “Search” and then “Multiple proteins”.

  19. Your results folder can be downloaded on the far-left side under “Download Results Folder”. NOTE: The original simulated data and results can be downloaded here.
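
The method comparison on the Primary Analysis tab (the Venn diagram across ImpulseDE2, Next maSigPro, and DESeq2, and the overlap with a previous study) comes down to set intersections over gene lists. The sketch below illustrates that idea only; it is not TIMEOR’s code. The gene names are made up, and the helpers `pairwise_overlaps` and `best_match` are hypothetical.

```python
from itertools import combinations


def pairwise_overlaps(gene_sets):
    """Map each pair of set names to the number of genes they share."""
    return {
        (a, b): len(gene_sets[a] & gene_sets[b])
        for a, b in combinations(sorted(gene_sets), 2)
    }


def best_match(method_sets, previous_study):
    """Name of the method sharing the most genes with the previous study."""
    return max(sorted(method_sets), key=lambda m: len(method_sets[m] & previous_study))


# Toy gene lists, made up for illustration (not TIMEOR output).
methods = {
    "ImpulseDE2": {"g1", "g2", "g3", "g4"},
    "Next maSigPro": {"g2", "g3", "g5"},
    "DESeq2": {"g3", "g4", "g6"},
}
previous = {"g1", "g2", "g3"}

print(pairwise_overlaps(methods))
print(best_match(methods, previous))
# Genes called by all three methods (the center of the Venn diagram).
print(set.intersection(*methods.values()))
```

With real results, each set would hold the differentially expressed gene identifiers reported by one method, and the previous-study set would come from a file such as prev_study.txt (its exact layout is not described here).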

     

     

Details

Real Data Subset in Tutorial

The original temporal RNA-seq data analyzed in our paper come from Zirin et al. (2019). In this tutorial, SRR8843750 and SRR8843738 are analyzed to demonstrate the “Process Raw Data” tab, in which raw RNA-seq data are retrieved, quality checked, aligned (with HISAT2 and Bowtie2), and converted to a read count matrix. The real data subset folder (which TIMEOR automatically generates) can be downloaded here.

Simulated Data in Tutorial

The original simulated data folder can be downloaded here.

Secondary Analysis: Factor Binding

To get the top 4 transcription factors (TFs), a 25% consensus threshold was used, with a normalized enrichment score threshold of 3.

Command used: Rscript get_top_tfs.r /PATH/TO/simulated_results/ dme 3 4 25 /PATH/TO/TIMEOR/
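
To rerun this command with other thresholds, it can help to name each positional argument. The sketch below only assembles the command line shown above. The meaning given to each position is an assumption inferred by matching the values 3, 4, and 25 to the thresholds in the previous sentence, and “dme” is assumed to be an organism code; check get_top_tfs.r itself before relying on this. The helper `build_get_top_tfs_command` is hypothetical.

```python
import shlex


def build_get_top_tfs_command(results_dir, organism, nes_threshold,
                              top_n, consensus_percent, timeor_dir):
    """Assemble the argument list for get_top_tfs.r (argument roles are assumed)."""
    return [
        "Rscript", "get_top_tfs.r",
        results_dir,             # results folder to analyze
        organism,                # assumed: organism code, e.g. "dme"
        str(nes_threshold),      # assumed: normalized enrichment score threshold
        str(top_n),              # assumed: number of top TFs to report
        str(consensus_percent),  # assumed: consensus threshold in percent
        timeor_dir,              # path to the TIMEOR directory
    ]


cmd = build_get_top_tfs_command(
    "/PATH/TO/simulated_results/", "dme", 3, 4, 25, "/PATH/TO/TIMEOR/"
)
# Print the command for inspection; nothing is executed here.
print(shlex.join(cmd))
```

The list form can be handed to `subprocess.run(cmd, check=True)` once the placeholder paths are replaced with real ones.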

The following bigWig files were collected:

  • ENCFF467OWR (read-depth normalized signal between both replicates) within dataset ENCSR240ADR for Stat92E

  • ENCFF609FCZ (read-depth normalized signal between both replicates) within dataset ENCSR681YMA for pho

  • ENCFF346CDA (read-depth normalized signal between all three replicates) within dataset ENCSR776AVR for CG7786
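
For scripted use, the three files listed above can be kept in a small lookup table. This is a sketch only: the accessions are copied from the list, while the download URL layout is an assumption about the ENCODE portal and should be verified before use. The names `BIGWIG_FILES` and `encode_download_url` are hypothetical.

```python
# Transcription factor -> (ENCODE file accession, ENCODE dataset accession),
# copied from the list above.
BIGWIG_FILES = {
    "Stat92E": ("ENCFF467OWR", "ENCSR240ADR"),
    "pho": ("ENCFF609FCZ", "ENCSR681YMA"),
    "CG7786": ("ENCFF346CDA", "ENCSR776AVR"),
}


def encode_download_url(file_accession):
    """Build a download URL (assumed ENCODE portal layout; verify before use)."""
    return (
        "https://www.encodeproject.org/files/"
        f"{file_accession}/@@download/{file_accession}.bigWig"
    )


for tf, (file_acc, dataset_acc) in BIGWIG_FILES.items():
    print(tf, dataset_acc, encode_download_url(file_acc))
```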

Real Data in Publication

The results presented in TIMEOR’s publication can be downloaded in TIMEOR’s automatically generated folders here.